CORE-680: Add OBI distro #4468

Merged
damemi merged 5 commits into odigos-io:main from damemi:feature/core-680
Apr 27, 2026

Conversation

@damemi
Member

@damemi damemi commented Mar 23, 2026

What this PR does / why we need it:

This adds opentelemetry-ebpf-instrumentation (OBI) as a supported workload instrumentation distro.

This provides additional coverage and gives users a way to deploy OBI within our platform if they'd like to.

It relies on features we added upstream to OBI in open-telemetry/opentelemetry-ebpf-instrumentation#1321 and open-telemetry/opentelemetry-ebpf-instrumentation#1388 to support dynamic PID selection (as opposed to OBI's static service/exe config approach).

The OBI Instrumenter is a single long-lived routine that handles process events, filters/matches them based on config criteria (i.e., which PIDs we want), and attaches shared eBPF programs to those processes. The DynamicSelector we added upstream provides a hook into the Instrumenter to update the filter/matching criteria at runtime without needing to restart OBI.

This adds OBI as an sdk/distro that can be used with any language. The basic flow of the new distro is:

  1. NewOBIInstrumentationFactory(): Initializes the OBI config and DynamicSelector, but does not start the OBI Instrumenter routine yet. This prevents the Instrumenter from running and using resources if nothing is actually instrumented by OBI.
  2. factory.CreateInstrumentation(ctx, pid): If the OBI Instrumenter is not started yet, this starts it and returns an obiInstrumentation handle that stores the DynamicSelector and the instrumented PID. This does not attach the PID to OBI yet.
  3. obiInstrumentation.Load(): Calls DynamicSelector.AddPIDs(pid) with the PID stored in the obiInstrumentation handle, which updates the running OBI manager to include/attach this process.
  4. obiInstrumentation.Close(): Calls DynamicSelector.RemovePIDs(pid) with the stored PID, which updates the OBI manager to exclude/detach this process.
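
To make the four steps concrete, here is a minimal sketch of the flow, not the code in this PR. `pidSelector` is a hypothetical stand-in for the upstream DynamicSelector (only `AddPIDs` and `RemovePIDs` are named above; `snapshot`, `starts`, and the lowercase type names are made up for illustration), and "starting the Instrumenter" is reduced to bumping a counter.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"sync"
)

// pidSelector is a hypothetical stand-in for OBI's DynamicSelector.
type pidSelector struct {
	mu   sync.Mutex
	pids map[int]struct{}
}

func (s *pidSelector) AddPIDs(pids ...int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, p := range pids {
		s.pids[p] = struct{}{}
	}
}

func (s *pidSelector) RemovePIDs(pids ...int) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, p := range pids {
		delete(s.pids, p)
	}
}

// snapshot returns the selected PIDs in ascending order (illustration only).
func (s *pidSelector) snapshot() []int {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make([]int, 0, len(s.pids))
	for p := range s.pids {
		out = append(out, p)
	}
	sort.Ints(out)
	return out
}

// factory mirrors step 1: it holds the selector but runs nothing yet.
type factory struct {
	selector *pidSelector
	start    sync.Once
	starts   int // how many times the stand-in Instrumenter was started
}

func newFactory() *factory {
	return &factory{selector: &pidSelector{pids: map[int]struct{}{}}}
}

// CreateInstrumentation mirrors step 2: start lazily, once, and return a
// handle. The PID is not attached yet.
func (f *factory) CreateInstrumentation(ctx context.Context, pid int) *instrumentation {
	f.start.Do(func() { f.starts++ }) // real code would launch the OBI routine here
	return &instrumentation{selector: f.selector, pid: pid}
}

type instrumentation struct {
	selector *pidSelector
	pid      int
}

// Load mirrors step 3 and Close mirrors step 4.
func (o *instrumentation) Load()  { o.selector.AddPIDs(o.pid) }
func (o *instrumentation) Close() { o.selector.RemovePIDs(o.pid) }

// runLifecycle walks two PIDs through the flow and reports what it observed.
func runLifecycle() (starts int, afterLoad, afterClose []int) {
	f := newFactory()
	ctx := context.Background()
	a := f.CreateInstrumentation(ctx, 101)
	b := f.CreateInstrumentation(ctx, 202)
	a.Load()
	b.Load()
	afterLoad = f.selector.snapshot()
	a.Close()
	afterClose = f.selector.snapshot()
	return f.starts, afterLoad, afterClose
}

func main() {
	starts, afterLoad, afterClose := runLifecycle()
	fmt.Println(starts, afterLoad, afterClose)
}
```

The `sync.Once` here only covers the "start once, never stop" shape; it cannot restart after a shutdown, so it does not address the stop/restart caveat.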

A caveat to this design is that it's difficult to stop the Instrumenter once it's started (i.e., when all OBI processes have been uninstrumented and we no longer need that routine) without complex async management. We could do something like this on Close():

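func Close() {
  // Detach this PID from the running Instrumenter.
  f.Selector.RemovePIDs(pid)
  // If nothing is left to instrument, stop the Instrumenter.
  if len(f.Selector.GetPIDs()) == 0 {
    cancelObiCtx()
  }
}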

But that would require a mutex, and it's possible that another OBI app could come along and wait to be instrumented while that mutex is held, so the Instrumenter gets cancelled before the new app is added. That would be very confusing to debug. I tried a lot of different approaches and they were all messy. I think some more upstream OBI changes around an actual handle on the Instrumenter would help, along with other use cases for holding a reference to the Instrumenter itself.
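
For reference, one possible shape for the start/stop bookkeeping is sketched below. This is not what the PR implements and every name in it (`lifecycle`, `acquire`, `release`) is hypothetical. It assumes both the "create" and "close" paths take the same mutex and that the stopped state is recorded under that lock, so a late arrival restarts the routine instead of attaching to a cancelled one. It counts handles rather than asking the selector for its PIDs, and it says nothing about what OBI itself does with in-flight attaches.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// lifecycle is a hypothetical guard around the Instrumenter's context.
type lifecycle struct {
	mu     sync.Mutex
	active int                // handles that have not been released yet
	cancel context.CancelFunc // nil while the Instrumenter is stopped
	starts int                // number of (re)starts, for illustration
}

// acquire registers one handle, starting the Instrumenter if it is stopped.
func (l *lifecycle) acquire(parent context.Context) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.cancel == nil {
		ctx, cancel := context.WithCancel(parent)
		l.cancel = cancel
		go func() { <-ctx.Done() }() // stand-in for the long-lived OBI routine
		l.starts++
	}
	l.active++
}

// release drops one handle and stops the Instrumenter when none are left.
// The check and the cancel happen under the same lock as acquire.
func (l *lifecycle) release() {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.active == 0 {
		return
	}
	l.active--
	if l.active == 0 {
		l.cancel()
		l.cancel = nil // tells the next acquire to restart
	}
}

// runSequence shows stop followed by a late arrival.
func runSequence() (starts int, running bool) {
	l := &lifecycle{}
	ctx := context.Background()
	l.acquire(ctx) // first app: starts the Instrumenter
	l.release()    // last app gone: stops it
	l.acquire(ctx) // late arrival: restarts
	return l.starts, l.cancel != nil
}

// runConcurrent hammers acquire/release from n goroutines.
func runConcurrent(n int) (active int, running bool) {
	l := &lifecycle{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			l.acquire(context.Background())
			l.release()
		}()
	}
	wg.Wait()
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.active, l.cancel != nil
}

func main() {
	fmt.Println(runSequence())
	fmt.Println(runConcurrent(50))
}
```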

Overall:

  • OBI Factory -- Called once at odiglet startup, runs nothing
  • OBI CreateInstrumentation -- Starts OBI (if necessary) as a singleton and creates the per-process instrumentation used by Odiglet
  • Instrumentation Load -- Attaches the PID to the OBI singleton

Another possible upstream contribution would be a signal from OBI for when it's actually started/running/ready to accept new PIDs, which Load() could wait on.
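
A rough sketch of what waiting on such a signal could look like from the Odiglet side. OBI does not expose this today per the note above, so `readiness`, `markReady`, and `load` are all hypothetical names; the only real APIs used are from the Go standard library.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// readiness is a hypothetical one-shot "Instrumenter is ready" signal.
type readiness struct {
	once sync.Once
	ch   chan struct{}
}

func newReadiness() *readiness { return &readiness{ch: make(chan struct{})} }

// markReady is what the Instrumenter would call once it accepts PIDs.
// It is safe to call more than once.
func (r *readiness) markReady() { r.once.Do(func() { close(r.ch) }) }

// wait blocks until the signal fires or ctx is done.
func (r *readiness) wait(ctx context.Context) error {
	select {
	case <-r.ch:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// load is a stand-in for Load(): wait for readiness, then attach the PID.
func load(ctx context.Context, r *readiness, attach func()) error {
	if err := r.wait(ctx); err != nil {
		return err
	}
	attach()
	return nil
}

// runReady simulates an Instrumenter that becomes ready after a short delay.
func runReady() (attached bool, err error) {
	r := newReadiness()
	go func() {
		time.Sleep(10 * time.Millisecond) // simulated startup delay
		r.markReady()
	}()
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	err = load(ctx, r, func() { attached = true })
	return attached, err
}

// runTimeout simulates an Instrumenter that never becomes ready.
func runTimeout() (attached bool, timedOut bool) {
	r := newReadiness()
	ctx, cancel := context.WithTimeout(context.Background(), 20*time.Millisecond)
	defer cancel()
	err := load(ctx, r, func() { attached = true })
	return attached, errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(runReady())
	fmt.Println(runTimeout())
}
```

Closing a channel rather than sending on it lets any number of Load() calls wait on the same signal.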

This whole approach is slightly different from languages like Go, where Odiglet is the long-lived process, so each CreateInstrumentation() in Go calls the go-auto framework to load, attach, and manage. In the case of OBI, the OBI Instrumenter handles loading, attaching, and managing, so Odiglet is a wrapper layer on top of that for integration with our control plane.

This also adds OBI eBPF generation to the odiglet Dockerfile using the upstream [obi-generator](https://github.com/open-telemetry/opentelemetry-ebpf-instrumentation/pkgs/container/obi-generator).

The OBI distro is added as its own module so it can be imported into enterprise more easily: https://github.com/odigos-io/odigos-enterprise/pull/2577 (see that PR for details on the OBI distro module)

UI Changes here: https://github.com/odigos-io/ui-kit/pull/748

New C++ app in simple-demo for an e2e test here: odigos-io/simple-demo#67

Changelog entry: Does this PR introduce a user-facing bug fix, feature, dependency update, or breaking change?

feat: Support opentelemetry-ebpf-instrumentation (OBI) for workload instrumentation

@damemi damemi force-pushed the feature/core-680 branch 3 times, most recently from b8375fc to 13ac9be Compare March 26, 2026 00:07
@damemi damemi force-pushed the feature/core-680 branch 4 times, most recently from 6445339 to 7321f97 Compare April 8, 2026 20:00
@damemi damemi added the operator-rbac-approved Bypass the Operator RBAC check label Apr 8, 2026
@damemi damemi force-pushed the feature/core-680 branch 8 times, most recently from 49cbb8e to fe069ae Compare April 9, 2026 20:25
@damemi damemi marked this pull request as ready for review April 9, 2026 20:31
@damemi damemi force-pushed the feature/core-680 branch from fe069ae to 3b02a7b Compare April 9, 2026 20:33
Comment thread api/k8sconsts/device.go Outdated
Comment thread distros/distro/oteldistribution.go Outdated
Comment thread distros/yamls/opentelemetry-ebpf-instrumentation.yaml
Comment thread distros/yamls/opentelemetry-ebpf-instrumentation.yaml Outdated
Comment thread distros/yamls/opentelemetry-ebpf-instrumentation.yaml Outdated
Comment thread odiglet/pkg/ebpf/sdks/obi/obi.go
- ''
resources:
- pods
- services
Collaborator

why is this required?

Member Author

OBI uses service discovery out of the box. I'm not sure if it's specifically for trace decoration or just cluster-wide discovery. I'll check if it's configurable; otherwise odiglet logs a bunch of permission errors from OBI.

Comment thread instrumentor/controllers/agentenabled/sync.go Outdated
Comment thread instrumentor/controllers/agentenabled/sync.go Outdated
Comment thread odiglet/pkg/ebpf/process_details.go Outdated
@damemi damemi force-pushed the feature/core-680 branch 3 times, most recently from 03d748c to b8ef2d1 Compare April 22, 2026 14:46
Comment thread odiglet/pkg/ebpf/sdks/obi/obi.go Outdated
Comment on lines +22 to +24
func obiLogger() *commonlogger.OdigosLogger {
	return commonlogger.LoggerCompat().With("subsystem", "ebpfobi")
}
Collaborator

I think this will create a new logger instance on every call?
It may be worth putting the logger in the factory.

In addition, some of the logging might be too verbose. In general, it is better to let the caller (the instrumentation manager) handle any logging for errors or meaningful events if possible.
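
A small sketch of the "logger in the factory" idea, using the standard library's `log/slog` as a stand-in for `commonlogger` (whose API is not shown in this thread). The `factory`, `create`, and `load` names are hypothetical; the point is only that the subsystem-scoped logger is derived once and shared by every handle.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
	"strings"
)

// factory derives the subsystem logger once, at construction time.
type factory struct {
	logger *slog.Logger
}

func newFactory(base *slog.Logger) *factory {
	return &factory{logger: base.With("subsystem", "ebpfobi")}
}

// instrumentation reuses the factory's logger instead of building its own.
type instrumentation struct {
	logger *slog.Logger
	pid    int
}

func (f *factory) create(pid int) *instrumentation {
	return &instrumentation{logger: f.logger, pid: pid}
}

func (o *instrumentation) load() {
	o.logger.Debug("attaching pid", "pid", o.pid)
}

// runLogging reports whether two handles share one logger and whether the
// emitted line carries the subsystem and pid attributes.
func runLogging() (shared, hasSubsystem, hasPID bool) {
	var buf bytes.Buffer
	h := slog.NewTextHandler(&buf, &slog.HandlerOptions{Level: slog.LevelDebug})
	f := newFactory(slog.New(h))
	a, b := f.create(101), f.create(202)
	a.load()
	out := buf.String()
	return a.logger == b.logger,
		strings.Contains(out, "subsystem=ebpfobi"),
		strings.Contains(out, "pid=101")
}

func main() {
	fmt.Println(runLogging())
}
```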

Comment thread odiglet/pkg/ebpf/sdks/obi/obi.go Outdated
Comment thread odiglet/pkg/ebpf/sdks/obi/obi.go Outdated
Comment thread odiglet/pkg/ebpf/sdks/obi/obi.go Outdated
Comment thread odiglet/pkg/ebpf/sdks/obi/obi.go Outdated
type obiInstrumentation struct {
	selector *discover.DynamicPIDSelector
	pid      int
	factory  *OBIInstrumentationFactory
Collaborator

I think it would be cleaner to either:

  1. have the factory pass a cleanup function
  2. have the factory pass a context and cancel function

Instead of each instrumentation object having a pointer to the factory.

Member Author

I tried that, but we need to be able to write back to the factory somehow when an Instrumentation object cancels the OBI Instrumenter, so that the factory knows to restart the OBI Instrumenter on the next instrumentation.

Collaborator

@damemi why is it not possible with a cleanup function?

Member Author

@blumamir the cleanup function would still need to have a reference back to the factory anyway (to signal that the next instrumentation should re-start the instrumenter), so it's essentially just a closure with the logic that's already here
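
To illustrate both sides of this thread, here is a hedged sketch of the cleanup-function variant. All names are hypothetical and the Instrumenter is reduced to a `running` flag. The handle no longer has a `*factory` field, but the cleanup it holds is a method value bound to the factory, so the reference back is still there, just hidden in the closure.

```go
package main

import (
	"fmt"
	"sync"
)

// factory is a stripped-down, hypothetical stand-in for the OBI factory.
type factory struct {
	mu      sync.Mutex
	running bool // whether the stand-in Instrumenter is considered started
	active  int  // handles created and not yet closed
}

// instrumentation holds a cleanup func instead of a pointer to the factory.
type instrumentation struct {
	pid     int
	cleanup func()
}

func (f *factory) create(pid int) *instrumentation {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.running = true // (re)start if needed
	f.active++
	// f.release is a method value: it captures f.
	return &instrumentation{pid: pid, cleanup: f.release}
}

// release writes back to the factory so the next create knows to restart.
func (f *factory) release() {
	f.mu.Lock()
	defer f.mu.Unlock()
	f.active--
	if f.active == 0 {
		f.running = false
	}
}

func (f *factory) isRunning() bool {
	f.mu.Lock()
	defer f.mu.Unlock()
	return f.running
}

func (o *instrumentation) Close() { o.cleanup() }

// runCleanup closes every handle, then creates a new one.
func runCleanup() (stopped, restarted bool) {
	f := &factory{}
	a, b := f.create(101), f.create(202)
	a.Close()
	b.Close()
	stopped = !f.isRunning()
	f.create(303)
	restarted = f.isRunning()
	return stopped, restarted
}

func main() {
	fmt.Println(runCleanup())
}
```

Whether this reads as cleaner is a style call: the handle's type no longer mentions the factory, but the coupling is the same.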

Comment thread odiglet/pkg/ebpf/sdks/obi/go.mod Outdated
Comment thread distros/yamls/opentelemetry-ebpf-instrumentation.yaml Outdated
Comment thread odiglet/go.mod Outdated
Comment thread odiglet/go.mod
Comment on lines +20 to +30
 	go.opentelemetry.io/otel v1.43.0
 	go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc v1.43.0
 	go.opentelemetry.io/otel/metric v1.43.0
 	go.opentelemetry.io/otel/sdk v1.43.0
 	golang.org/x/sync v0.20.0
 	golang.org/x/sys v0.42.0
-	google.golang.org/grpc v1.79.3
+	google.golang.org/grpc v1.80.0
 	google.golang.org/protobuf v1.36.11
-	k8s.io/api v0.35.2
-	k8s.io/apimachinery v0.35.2
-	k8s.io/client-go v0.35.2
+	k8s.io/api v0.35.3
+	k8s.io/apimachinery v0.35.3
+	k8s.io/client-go v0.35.3
Collaborator

I thought the point of making obi a separate module was so it won't make us align to its dependencies. If that's the case, I'd expect not to have all these changes in this PR. Or, if they are unavoidable, I'd expect us to remove the separate module for obi.

Member Author

The point of the separate module is so we can import it into enterprise without getting the ambiguous go-auto import.

@damemi damemi force-pushed the feature/core-680 branch from 4216c2e to 73372d4 Compare April 24, 2026 17:19
Collaborator

@RonFed RonFed left a comment

lgtm, still think we need to check out consuming the archive OBI publishes that you shared with me before merging with the current Dockerfile/Makefile changes

Collaborator

@blumamir blumamir left a comment

Added a few comments/suggestions, but overall looks good.
Would love to chat about it before merging.

Comment thread common/lang_detection.go
Comment on lines +44 to +47
// IsProgrammingLanguageWildcard reports whether lang is the distro wildcard meaning "any language".
func IsProgrammingLanguageWildcard(lang ProgrammingLanguage) bool {
	return strings.TrimSpace(string(lang)) == string(ProgrammingLanguageWildcard)
}
Collaborator

For other languages we compare directly with == instead of adding a util function. Why is it different for the * language?
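
For context on the question, the only behavioral difference between the helper and a plain == is the `strings.TrimSpace` call. The sketch below assumes the wildcard constant's value is `"*"` (inferred from "the * language" here, not confirmed by the diff) and redeclares the type locally so it runs on its own; `compare` is a hypothetical helper for the demo.

```go
package main

import (
	"fmt"
	"strings"
)

// ProgrammingLanguage is redeclared here so the example is self-contained.
type ProgrammingLanguage string

// Assumed value; the real constant lives in common/lang_detection.go.
const ProgrammingLanguageWildcard ProgrammingLanguage = "*"

// IsProgrammingLanguageWildcard is copied from the diff above.
func IsProgrammingLanguageWildcard(lang ProgrammingLanguage) bool {
	return strings.TrimSpace(string(lang)) == string(ProgrammingLanguageWildcard)
}

// compare returns the result of a direct == and of the helper.
func compare(lang ProgrammingLanguage) (direct, helper bool) {
	return lang == ProgrammingLanguageWildcard, IsProgrammingLanguageWildcard(lang)
}

func main() {
	for _, lang := range []ProgrammingLanguage{"*", " * ", "go"} {
		direct, helper := compare(lang)
		fmt.Printf("%q direct=%v helper=%v\n", string(lang), direct, helper)
	}
}
```

The two only disagree on whitespace-padded input such as `" * "`. Whether padded values can actually reach this check is not stated in the thread.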

Comment thread distros/yamls/opentelemetry-ebpf-instrumentation.yaml
Comment thread distros/distro/oteldistribution.go
Collaborator

we also have java-ebpf-instrumentations in enterprise. should we use the same here (instrumentations with an s at the end) to keep things consistent?

also, since the mechanism is a bit different, I wonder if this can cause confusion for users

Member Author

I don't think so. I named this literally after OBI, "opentelemetry-ebpf-instrumentation", the project itself, for clarity: https://github.com/open-telemetry/opentelemetry-ebpf-instrumentation

Comment on lines +87 to +89
if common.IsProgrammingLanguageWildcard(distro.Language) {
	continue
}
Collaborator

could be nice to add support for this in the rule someday in the future

Comment thread odiglet/pkg/ebpf/sdks/obi/obi.go
Comment thread odiglet/go.mod
Comment on lines +44 to +57
github.com/aws/aws-sdk-go-v2 v1.41.3 // indirect
github.com/aws/aws-sdk-go-v2/config v1.32.11 // indirect
github.com/aws/aws-sdk-go-v2/credentials v1.19.11 // indirect
github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.18.19 // indirect
github.com/aws/aws-sdk-go-v2/internal/configsources v1.4.19 // indirect
github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.7.19 // indirect
github.com/aws/aws-sdk-go-v2/internal/ini v1.8.5 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.13.6 // indirect
github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.13.19 // indirect
github.com/aws/aws-sdk-go-v2/service/signin v1.0.7 // indirect
github.com/aws/aws-sdk-go-v2/service/sso v1.30.12 // indirect
github.com/aws/aws-sdk-go-v2/service/ssooidc v1.35.16 // indirect
github.com/aws/aws-sdk-go-v2/service/sts v1.41.8 // indirect
github.com/aws/smithy-go v1.24.2 // indirect
Collaborator

could be nice not to pull these. I guess obi is a bundle of everything and cannot be composed of just what's needed?

Member Author

not today unfortunately

Comment thread odiglet/Makefile Outdated

 debug-build-odiglet: generate
-	go build -o odiglet -gcflags "all=-N -l" cmd/main.go
+	CGO_ENABLED=0 go build -a -o odiglet -gcflags "all=-N -l" cmd/main.go
Collaborator

Suggested change
-	CGO_ENABLED=0 go build -a -o odiglet -gcflags "all=-N -l" cmd/main.go
+	$(GOBUILD) -o odiglet -gcflags "all=-N -l" cmd/main.go

@damemi damemi force-pushed the feature/core-680 branch 5 times, most recently from 7205983 to 46f4458 Compare April 27, 2026 12:59
@damemi damemi force-pushed the feature/core-680 branch from 46f4458 to 92f0ca1 Compare April 27, 2026 13:00
@damemi damemi merged commit 60b72b5 into odigos-io:main Apr 27, 2026
79 of 80 checks passed
Labels

operator-rbac-approved Bypass the Operator RBAC check

3 participants